Abstract

Background: RNA-RNA interaction plays an important role in the regulation of gene expression and cell development. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. In the RNA-RNA interaction prediction problem, two RNA sequences are given as inputs and the goal is to find the optimal secondary structure of each RNA together with the interactions between them. Several algorithms have been proposed to predict RNA-RNA interaction structure. However, most of them suffer from high computational time.

Results: In this paper, we introduce a novel genetic algorithm called GRNAs to predict RNA-RNA interaction. The proposed algorithm is evaluated on standard datasets and achieves appropriate accuracy with lower time complexity than other state-of-the-art algorithms. In the proposed algorithm, each individual is a secondary structure of two interacting RNAs. The minimum free energy is used as the fitness function for each individual. Over successive generations, the algorithm converges toward the optimal secondary structure (minimum free energy structure) of the two interacting RNAs by using crossover and mutation operations.

Conclusions: The algorithm is suitable for joint secondary structure prediction. The results achieved on a set of known interacting RNA pairs are compared with those of other related algorithms, and the effectiveness and validity of the proposed algorithm are demonstrated. The time complexity of the algorithm in each iteration is shown to be as efficient as that of the other approaches.

Keywords: RNA secondary structure, Minimum free energy, Fitness function



Background 

Major successes have been achieved in the treatment of some cancers, including colon, breast and pancreatic cancer, by using RNA-RNA interaction to suppress the expression of genes involved in the development of these diseases. The interaction between two RNAs is known as the newest and most efficient method for gene silencing. It has been shown that small interfering RNAs (siRNAs) can be used for silencing their target mRNAs [1]. Furthermore, small RNAs (sRNAs) play an important role in the regulation of gene expression. They usually bind to their target mRNAs to prevent their translation [2].
In the RNA-RNA Interaction Prediction (RRIP) problem, two RNA sequences are given as inputs and the goal is to find the minimum free energy Secondary Structure of Interacting RNAs (SSIR). To tackle this problem, several algorithms have been proposed. Andronescu et al. [3] proposed a method based on dynamic programming in which two RNA sequences are concatenated as a single sequence and its secondary structure is calculated. Another approach calculates the partition function of a secondary structure complex of multiple interacting RNAs [4]. This method rigorously extends the models of secondary structure to the multi-stranded case. Tools such as RNAhybrid [5], UNAFold [6] and RNAduplex from the ViennaRNA package [7] reduce computational time complexity by ignoring all the internal base pairings in both RNAs. RNAup [8,9] extends the standard partition function approach to RNA secondary structures and employs the single (unpaired) regions on each RNA to find the interaction between them. RNAplex [10,11] finds the possible hybridization sites for an RNA in large RNA databases based on a slight simplification of the energy model. In this model, the loop energy is assumed to be a function of the loop size.

Recently, a novel algorithm based on the multiple con- 
text free grammars was introduced in [12]. Accordingly, 
two real values called transition and emission probabil- 
ities are specified for each rule of the grammar. Then, a 
derivation tree is constructed for the grammar based on 
the rules with high probability. 

In heuristic based approaches, inRNAs [13] firstly 
predicts the loop regions in the native structure of each 
sequence, and then finds the optimal non-conflicting 
interaction between two RNAs. IntaRNA [14] combines 
the accessibility of target sites as well as the existence of 
a user-definable seed to find RNA-RNA interaction. Min- 
imizing the joint free energy between two RNA molecules 
under a number of energy models with growing com- 
plexity was introduced in [2]. Another interesting heuristic 
approach for this problem was presented in [15]. This algorithm employs dot-matrix representations of all possible base pairs to find the secondary structure of each RNA and the interactions between the two RNAs.

An approximation algorithm was presented in [1], 
where an RNA-RNA interaction graph is created in 
which every edge represents a possible bond in or be- 
tween two RNAs. A set of edges is found to maximize 
the number of bonds. A statistical sampling algorithm 
was introduced in [16] based on some modifications to 
the grammars. It calculates the interaction probabilities 
for any given single region on RNA. RactIP [17] predicts 
RNA-RNA interaction using integer programming. Ac- 
cordingly, it uses the approximate information of the 
internal and external base pairing probabilities of joint 
structures as an objective function of integer program- 
ming. PETcofold [18] employs covariance information 
in the internal and external base pairs to predict SSIR of 
two multiple alignments of RNA sequences. InteRNA 
[19] reduces the time and space complexity of RRIP 
problem described by Alkan et al. [2] using dynamic 
programming sparsification. 

One of the pitfalls of most existing algorithms is their high computational time for predicting RNA-RNA interaction, while a number of them have not been applied to RNA pairs to predict binding sites between the single (unpaired) regions of two RNAs. Alkan et al. [2] proved that RNA-RNA interaction prediction is an NP-complete problem.

In this paper, we propose a new genetic algorithm called GRNAs as an appropriate solution for the RRIP problem. The algorithm achieves high accuracy on standard RNA pairs. In this method, all possible stems in each RNA as well as all possible hybrid regions between the two RNAs are first extracted from a dot matrix. The initial population consists of individuals, each of which is an SSIR obtained from randomly selected stems and hybrid regions of the dot matrix. The minimum free energy is computed for each individual as its fitness value. In each generation, some individuals are selected to mate based on their fitness values and form a new population. Then, the mutation operation is applied to a few individuals. The generation of populations terminates when the free energy of an individual is low enough. Finally, one of the best individuals is selected as the optimal SSIR. The algorithm is run on real datasets and compared with other algorithms to investigate the efficiency and validity of the proposed method. The time and space complexity of the proposed method in each iteration is $O(l^2 + |P|)$, where $l$ and $|P|$ indicate the sum of the lengths of the two RNAs and the length of an individual, respectively. The results show that the accuracy of the algorithm is comparable to that of the other related methods.
The rest of this paper is organized as follows. In Section 2, some definitions and notations are described. In Section 3, a genetic algorithm called GRNAs is presented to predict RNA-RNA interaction. The results and conclusion are discussed in Sections 4 and 5, respectively.

Definitions and notations 

An RNA molecule is composed of a long, usually single-stranded chain of nucleotide units: adenine (A), cytosine (C), guanine (G) and uracil (U). Thus, $R = r_1 r_2 \cdots r_n$ in the 5'-3' direction is an RNA sequence, where $|R| = n$ and $r_i \in \{A, C, G, U\}$ ($1 \le i \le n$). The RNA structure is formed by the creation of hydrogen bonds between Watson-Crick complementary bases (A-U and C-G) and a wobble base pair (G-U).

In an RNA secondary structure, each base interacts with at most one other base, and no base pairs cross each other. The two bases $r_i$ and $r_j$ ($1 \le i < j \le n$) of a base pair $(r_i, r_j)$ are represented by '(' and ')', respectively, and each unpaired base is denoted by '.'. A stem consists of consecutive base pairs and a loop is a sequence of consecutive unpaired bases.

A secondary structure of two interacting RNAs, $R_1$ and $R_2$, contains the set of stems in each RNA and the hybrid regions between the two RNAs as well as loops. Each hybrid region consists of consecutive hybrid base pairs between the two RNAs. The two bases $r_i \in R_1$ and $r_j \in R_2$ of a hybrid base pair $(r_i, r_j)$ are represented by '[' and ']', respectively.

Example. Let $R_1$ = CGGUUUGAGGUCCG and $R_2$ = ACUACCGAAAAGUU be two RNA sequences. The SSIR of the two RNAs is shown as follows:

5' - CGGUUUGAGGUCCG - 3'    5' - ACUACCGAAAAGUU - 3'
     ((([[[..[[[)))              (((]]]..]]])))
In this example, each RNA has one stem. In the left-hand RNA, the stem is formed by bonding between CGG and the reverse of CCG (GCC). There are two hybrid regions between the sequences $R_1$ and $R_2$. The first is produced by binding between UUU and the reverse of AAA (AAA). The second is generated by binding between GGU and the reverse of ACC (CCA).
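The bracket notation of the example can be turned into explicit base pairs with a few lines of Python. This is only an illustrative sketch, not part of GRNAs: the function name `parse_joint_structure` is hypothetical, the bracket strings are assumed to read `((([[[..[[[)))` and `(((]]]..]]])))`, and hybrid pairs are assumed to be antiparallel, so the first '[' of $R_1$ pairs with the last ']' of $R_2$.

```python
def parse_joint_structure(struct1, struct2):
    """Return (intra1, intra2, hybrid) base pairs using 1-based positions."""

    def intra_pairs(struct):
        # Classic stack-based matching of '(' and ')'.
        stack, pairs = [], []
        for pos, symbol in enumerate(struct, start=1):
            if symbol == "(":
                stack.append(pos)
            elif symbol == ")":
                if not stack:
                    raise ValueError("unbalanced ')' in structure")
                pairs.append((stack.pop(), pos))
        if stack:
            raise ValueError("unbalanced '(' in structure")
        return sorted(pairs)

    opens = [p for p, s in enumerate(struct1, start=1) if s == "["]
    closes = [p for p, s in enumerate(struct2, start=1) if s == "]"]
    if len(opens) != len(closes):
        raise ValueError("unbalanced hybrid brackets")
    # Assumption: the strands hybridise antiparallel, so the first '[' of
    # the first RNA pairs with the last ']' of the second RNA.
    hybrid = list(zip(opens, reversed(closes)))
    return intra_pairs(struct1), intra_pairs(struct2), hybrid


r1 = "CGGUUUGAGGUCCG"
r2 = "ACUACCGAAAAGUU"
intra1, intra2, hybrid = parse_joint_structure("((([[[..[[[)))", "(((]]]..]]])))")
# Nucleotides taking part in each hybrid pair (first RNA, second RNA).
hybrid_bases = [(r1[i - 1], r2[j - 1]) for i, j in hybrid]
```

Under these assumptions, `hybrid_bases` contains only U-A and G-C pairs, which agrees with the two hybrid regions described in the example.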

A new genetic algorithm for RNA-RNA interaction 
prediction 

A genetic algorithm is an optimization method based on evolutionary biology that is widely used to solve search and optimization problems [20-22]. In this section, a new genetic algorithm, GRNAs, is presented to predict RNA-RNA interaction. In the following, the initial population, the fitness function, and the crossover and mutation operations are introduced.

Initial population 

In the proposed algorithm, two RNA sequences $R' = r'_1 r'_2 \cdots r'_n$ ($|R'| = n$) and $R'' = r''_1 r''_2 \cdots r''_m$ ($|R''| = m$) are given as inputs. The two RNAs $R'$ and $R''$ are converted to the sequence $R = r_1 r_2 \cdots r_l$ as follows:

$$r_i = \begin{cases} r'_i & 1 \le i \le n, \\ r''_{i-n-1} & n + 2 \le i \le l, \\ N & i = n + 1, \end{cases}$$

where $N$ is an arbitrary character to distinguish between the two sequences and $l = m + n + 1$.

A dot matrix $M_{l \times l}$ is made, where the axes of the dot matrix correspond to the sequence $R$ and the reverse of $R$, as follows:

$$M[i,j] = \begin{cases} 1 & \text{if } (r_i, r_{l-j+1}) \in \{(A,U),(U,A),(C,G),(G,C),(U,G),(G,U)\}, \\ 0 & \text{else,} \end{cases}$$

where $r_i$ and $r_{l-j+1}$ ($1 \le i, j \le l$) are the $i$-th and $(l-j+1)$-th nucleotides in the sequence $R = r_1 r_2 \cdots r_l$, respectively. Each right-skewed run of consecutive 1's that is parallel to the main diagonal of the dot matrix is selected as a sub-diagonal. Each sub-diagonal shows a possible stem in each RNA or a hybrid region between the two RNAs. The set $D^*$ contains all sub-diagonals in the dot matrix as follows:

$$D^* = \{\langle i,j,t \rangle \mid 1 \le i \le l \;\&\; 1 \le j \le l \;\&\; 1 \le t \le l-1\},$$

where $i$ and $j$ indicate the start positions of the row and the column of a sub-diagonal with $t+1$ consecutive 1's, respectively. Hence, each $\langle i,j,t \rangle$ is a set of consecutive base pairs as follows:

$$\langle i,j,t \rangle = \{(r_i, r_{l-j+1}), \ldots, (r_{i+t}, r_{l-j-t+1})\}.$$

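The construction of the concatenated sequence, the dot matrix and the sub-diagonals can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GRNAs implementation: the function names are hypothetical, a pair is counted only when $r_i$ lies 5' of its partner (so that mirrored copies of the same pair are ignored), runs shorter than two base pairs are skipped, and no minimum loop length is enforced.

```python
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def concatenate(r1, r2, sep="N"):
    """Join the two RNAs with a separator character that pairs with nothing."""
    return r1 + sep + r2


def dot_matrix(seq):
    """Entry [i][j] (0-based) is 1 if seq[i] can pair with seq[l-1-j]."""
    l = len(seq)
    return [[1 if (seq[i], seq[l - 1 - j]) in PAIRS else 0 for j in range(l)]
            for i in range(l)]


def sub_diagonals(matrix, min_pairs=2):
    """Maximal diagonal runs of 1's as 1-based triples (i, j, t) with t+1 pairs."""
    l = len(matrix)

    def is_set(i, j):
        # Assumption: keep a cell only when r_i lies 5' of its partner
        # (i < l-1-j), so each base pair is represented once.
        return 0 <= i < l and 0 <= j < l and matrix[i][j] == 1 and i < l - 1 - j

    found = []
    for i in range(l):
        for j in range(l):
            if not is_set(i, j) or is_set(i - 1, j - 1):
                continue  # empty cell, or not the start of a run
            t = 0
            while is_set(i + t + 1, j + t + 1):
                t += 1
            if t + 1 >= min_pairs:
                found.append((i + 1, j + 1, t))
    return found


def base_pairs(triple, l):
    """Expand <i, j, t> into the pairs (i+k, l-j-k+1) for k = 0..t (1-based)."""
    i, j, t = triple
    return [(i + k, l - j - k + 1) for k in range(t + 1)]


seq = concatenate("GGG", "CCC")
matrix = dot_matrix(seq)
subs = sub_diagonals(matrix)
```

For the toy input above, the longest sub-diagonal expands to the three G-C pairs between the two strands, `base_pairs(subs[0], len(seq))`.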
From prior knowledge, Watson-Crick base pairs occur more often than wobble pairs in RNA structures. In this regard, we computed the percentage of G-U pairs in our dataset, which is approximately 14%. So, G-U pairs are removed from the sub-diagonals that include more than 14% G-U pairs. For each $d_1 = \langle i_1, j_1, t_1 \rangle \in D^*$ and $d_2 = \langle i_2, j_2, t_2 \rangle \in D^*$, $d_1 \triangleleft d_2$ is defined as

$$d_1 \triangleleft d_2 = \{(r_i, r_j) \in d_1 \mid \exists (r_k, r_g) \in d_2 : i = k \lor i = g \lor j = k \lor j = g\},$$

where $d_1 \triangleleft d_2$ represents all base pairs in $d_1$ overlapping with $d_2$.

For each individual $P$, a $|D^*|$-tuple is randomly made as

$$I = \langle x_1, x_2, \ldots, x_{|D^*|} \rangle, \qquad x_k \in \{0, 1\},$$

where $x_k$ ($1 \le k \le |D^*|$) indicates the $k$-th sub-diagonal in $I$. In other words, individual $P$ contains those sub-diagonals whose related $x_k$ in $I$ is equal to 1. Here, $x_k = 1$ means sub-diagonal $d_k \in P$, while $x_k = 0$ means $d_k \notin P$. Then, the individual $P$ is constructed as follows:

$$P = \{C_k \mid x_k = 1, \; C_k \ne \emptyset\}, \qquad (1)$$

where the set $C_k$ (the modified $k$-th sub-diagonal) is obtained as follows:

$$C_k = \{(r_i, r_j) \in d_k\} - \bigcup_{1 \le t \le k-1} d_k \triangleleft C_t.$$

Here, $C_k \in P$ is the set of base pairs in $d_k$ that have no base in common with the previous sub-diagonals, $d_t$ ($1 \le t < k$), of individual $P$. Finally, $C_k$ is modified by removing the lonely base pairs from it as follows:

$$C_k = C_k - \{(r_i, r_j) \in C_k \mid (r_{i+1}, r_{j-1}) \notin C_k \;\&\; (r_{i-1}, r_{j+1}) \notin C_k\}.$$

Notice that a set of the produced individuals creates an initial population.
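The decoding of a 0/1 tuple into an individual (Equation 1) can be sketched as below. The helper names are hypothetical and the sketch makes one assumption that the text leaves open: overlaps are checked against the already modified sets $C_t$ (after lonely pairs were removed), not against the raw sub-diagonals.

```python
def overlapping(pair, chosen):
    """True if `pair` shares a base with any pair already in `chosen`."""
    used = {base for p in chosen for base in p}
    return pair[0] in used or pair[1] in used


def drop_lonely(pairs):
    """Keep only pairs stacked on a neighbour (i+1, j-1) or (i-1, j+1)."""
    present = set(pairs)
    return [(i, j) for (i, j) in pairs
            if (i + 1, j - 1) in present or (i - 1, j + 1) in present]


def build_individual(bits, subdiagonals):
    """Decode a 0/1 tuple into the list of non-empty modified sub-diagonals C_k.

    `subdiagonals[k]` is the list of base pairs of d_k, in the same order
    as the entries of `bits`.
    """
    individual, taken = [], []
    for bit, d_k in zip(bits, subdiagonals):
        if bit != 1:
            continue
        # Remove pairs that clash with earlier sub-diagonals, then lonely pairs.
        c_k = drop_lonely([pair for pair in d_k if not overlapping(pair, taken)])
        if c_k:  # empty C_k is not part of the individual
            individual.append(c_k)
            taken.extend(c_k)
    return individual


# Hypothetical sub-diagonals, given directly as lists of base pairs.
d1 = [(1, 10), (2, 9), (3, 8)]
d2 = [(3, 20), (4, 19), (5, 18)]
d3 = [(5, 30), (6, 29)]
individual = build_individual((1, 1, 1), [d1, d2, d3])
```

In this toy case `d2` loses the pair (3, 20) because base 3 is already used by `d1`, and `d3` vanishes entirely: (5, 30) clashes with `d2` and the remaining pair (6, 29) is lonely.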

Fitness function 

For each individual $P$, let $S$ and $H$ represent the secondary structures of the two RNAs and the binding sites between the two RNAs, respectively. The fitness function is then defined as follows:

$$\mathrm{Fitness}(P) = \mathrm{MFE}(S) + \mathrm{MFE}(H),$$

where for $C \in \{S, H\}$, $\mathrm{MFE}(C)$ denotes the minimum free energy of structure $C$. We apply RNAeval.exe [7] to compute the minimum free energies of the secondary structures and the binding sites separately.



Montaseri et al. Algorithms for Molecular Biology 2014, 9:17
http://www.almob.org/content/9/1/17



Crossover 

Crossover operation is performed between the indi- 
viduals with the rate of 0.9. The good and mediocre 
individuals are transferred to the next population. The 
remaining individuals are consecutively selected for cross- 
over operation. 

Let $P_1$ and $P_2$ be selected as parents. For each individual $P_i$, $1 \le i \le 2$, a $|D^*|$-tuple is defined as follows:

$$I_i = \langle x_{i1}, x_{i2}, \ldots, x_{i|D^*|} \rangle, \qquad x_{ij} = \begin{cases} 1 & \text{if } C_j \in P_i, \\ 0 & \text{else.} \end{cases}$$

In this procedure, a random position $k$, $1 \le k \le |D^*|$, is selected and $I_1$ and $I_2$ are crossed. Then, $I'_1$ and $I'_2$ are formed as follows:

$$I'_1 = \langle x_{11}, x_{12}, \ldots, x_{1k}, x_{2(k+1)}, \ldots, x_{2|D^*|} \rangle,$$

$$I'_2 = \langle x_{21}, x_{22}, \ldots, x_{2k}, x_{1(k+1)}, \ldots, x_{1|D^*|} \rangle.$$

Then, two new individuals $P'_1$ and $P'_2$ are generated from $I'_1$ and $I'_2$, respectively, in the same way as described for the initial population (see Equation 1).
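The one-point crossover on the 0/1 tuples can be written in a few lines. This is a minimal sketch with a hypothetical function name; the crossover position $k$ is passed in explicitly so that the example is deterministic, whereas in the algorithm it is chosen at random.

```python
def one_point_crossover(parent1, parent2, k):
    """Swap the tails of two equal-length 0/1 tuples after position k (1-based)."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if not 1 <= k <= len(parent1):
        raise ValueError("k must satisfy 1 <= k <= len(parent)")
    child1 = tuple(parent1[:k]) + tuple(parent2[k:])
    child2 = tuple(parent2[:k]) + tuple(parent1[k:])
    return child1, child2


child1, child2 = one_point_crossover((1, 1, 0, 0, 1), (0, 0, 1, 1, 0), k=2)
```

The children are still bit tuples, so they have to be decoded into individuals with the same overlap and lonely-pair rules as the initial population.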

Mutation 

The mutation operation is applied with a rate of 0.1 to a few randomly selected individuals in each generation. Assume that $P$ is an individual selected for mutation. For the individual $P$, a $|D^*|$-tuple is obtained by

$$I = \langle x_1, x_2, \ldots, x_{|D^*|} \rangle, \qquad x_k = \begin{cases} 1 & \text{if } C_k \in P, \\ 0 & \text{else.} \end{cases}$$

An item $x_j$ with $x_j = 1$ is randomly selected and replaced by 0. Then, another item $x_k$ ($x_k = 0$) is replaced by 1. The new individual, $P$, is obtained from the changed $I$ using the method proposed for the initial population (see Equation 1). Finally, if $C_k \notin P$ (all the base pairs of $d_k$ overlap with the existing sub-diagonals in $P$), another $x_i$ ($x_i = 0$) is selected and replaced by 1. This process continues until $C_k \in P$ or the defined number of generations is reached. After mutation has been performed on a number of individuals, they are sorted in increasing order of their fitness values.
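The basic swap performed by the mutation step can be sketched as follows. The function name is hypothetical, and the sketch covers only the first part of the procedure: it turns one randomly chosen 1 into 0 and one randomly chosen 0 into 1. The retry that is needed when the newly selected sub-diagonal turns out to be empty after decoding is left out.

```python
import random


def mutate(bits, rng):
    """Switch off one selected sub-diagonal and switch on one unselected one."""
    ones = [k for k, x in enumerate(bits) if x == 1]
    zeros = [k for k, x in enumerate(bits) if x == 0]
    if not ones or not zeros:
        return tuple(bits)  # nothing to swap
    mutated = list(bits)
    mutated[rng.choice(ones)] = 0
    mutated[rng.choice(zeros)] = 1
    return tuple(mutated)


rng = random.Random(0)  # fixed seed only to make the example repeatable
original = (1, 0, 1, 0, 0)
mutated = mutate(original, rng)
changed = sum(a != b for a, b in zip(original, mutated))
```

The swap keeps the number of selected sub-diagonals constant and changes exactly two positions of the tuple.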

Termination of the GRNAs algorithm 

The GRNAs algorithm terminates when the best individual has not changed for a definite number of generations or when the defined number of generations is reached. After termination, one of the best individuals is selected as the best folding of the two RNAs and the best interaction between them.
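The two stopping rules can be expressed as a small predicate over the history of best fitness values. This is a sketch under assumptions: the function name and the `patience` parameter (the number of generations without improvement that is tolerated) are hypothetical, and lower fitness is taken to be better because the fitness is a free energy.

```python
def should_stop(best_history, patience, max_generations):
    """Decide whether the search should end.

    best_history holds the best (lowest) fitness of each generation so far.
    """
    if len(best_history) >= max_generations:
        return True  # generation budget used up
    if len(best_history) > patience:
        recent = best_history[-(patience + 1):]
        # No value in the last `patience` generations beats the earlier one.
        return min(recent[1:]) >= recent[0]
    return False


stalled = should_stop([-5.0, -6.0, -6.0, -6.0], patience=2, max_generations=20)
improving = should_stop([-5.0, -6.0, -7.0], patience=2, max_generations=20)
```

A caller would append the best fitness after each generation and leave the loop as soon as the predicate returns `True`.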



Time and space complexity 

We have obtained the time and space complexity of GRNAs in each iteration. Building the dot matrix has complexity $O(l^2)$, where $l$ is the sum of the lengths of the two RNAs. Let $h$ and $|P|$ be the number of individuals and the length of an individual $P$. The time complexity of creating the initial population is $O(h \cdot |P| \ldots)$. We set $h = 40$ and $|P| = \max\{|R'|, |R''|\}$, so $h$ can be ignored. Sorting the individuals based on their fitness values requires $O(|P| \cdot h \cdot \log h)$. Crossover and mutation operations take $O(|P|^2)$ and $O(|P|)$, respectively. Thus, the time complexity of each iteration of the proposed algorithm is $O(h \cdot |P|(|P| + \log h))$. The number of iterations is at most $T = 20$. Therefore, the time complexity of the algorithm is $O(T \cdot h \cdot |P|(|P| + \log h) + l^2)$, which is simplified to $O(l^2 + |P|)$.

On the other hand, storing the $h$ individuals of length $|P|$ requires $O(h \cdot |P|)$ space. Furthermore, the algorithm stores both the dot matrix and an array of sub-diagonals, whose storage complexity is $O(l^2)$, where $l$ denotes the sum of the lengths of the two RNAs. Thus, the total space complexity of GRNAs is $O(h|P| + l^2)$, which is simplified to $O(l^2 + |P|)$.

Results and discussion 

GRNAs was run on a machine with a two-core Intel(R) Duo T6670 2.20 GHz processor and 4 GB RAM to predict the interaction structure between two RNAs. The proposed genetic algorithm was applied to two well-known datasets of RNA-RNA interactions. The first set contains R1inv-R2inv, Tar-Tar*, DIS-DIS, CopA-CopT and IncRNA54-RepZ in the bacterium Escherichia coli [12]. The joint secondary structures in this dataset include kissing hairpins. We evaluate the performance of joint secondary structure prediction on this dataset.

The algorithm was also applied to a second dataset with known binding sites, which includes the RNA pairs DsrA-Rpos, GcvB-argT, GcvB-dppA, GcvB-gltI, GcvB-livK, GcvB-livJ, GcvB-oppA, GcvB-STM4351, IstR-tisAB, MicA-ompA, MicA-lamB, MicC-ompC, MicF-ompF, OxyS-fhlA, RyhB-sdhD, RyhB-sodB, SgrS-ptsG and Spot42-galK [14]. This dataset is used to appraise the performance of RNA-RNA interaction prediction at the binding sites.

To evaluate the prediction accuracy of GRNAs, the F-measure ($F$) and the Matthews Correlation Coefficient (MCC) [18] are calculated using sensitivity ($Sn$) and positive predictive value ($PPV$). Let the number of correctly predicted base pairs, the number of falsely predicted base pairs and the number of unpredicted base pairs be denoted by $TP$, $FP$ and $FN$, respectively. Then $Sn$, $PPV$, $F$ and MCC are defined as follows:

$$Sn = \frac{TP}{TP + FN}, \qquad PPV = \frac{TP}{TP + FP},$$

$$F = \frac{2 \times Sn \times PPV}{Sn + PPV},$$

$$MCC = \sqrt{Sn \times PPV}.$$

Table 1 The results of joint secondary structure prediction of GRNAs in MCC in comparison to PETcofold and other joint structure prediction methods such as RNAcofold, inteRNA, Pairfold and RactIP

RNA-RNA pairs   GRNAs   PETcofold   RNAcofold   inteRNA   Pairfold   RactIP
MicA-ompA       67      87          80          49        86         57
OxyS-fhlA       60      80          61          64        61         48
RyhB-uof-fur    56      13          21          12        21         19
RyhB-sodB       74      67          65          70        65         65
Average         64      62          57          49        58         47
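The accuracy measures defined in this section are straightforward to compute from base-pair counts. The following sketch uses a hypothetical function name and illustrative counts that are not taken from the paper; MCC is computed with the approximation $\sqrt{Sn \times PPV}$ used here, not with the full four-cell definition.

```python
import math


def accuracy_measures(tp, fp, fn):
    """Return (Sn, PPV, F, MCC) as fractions in [0, 1]."""
    if tp == 0:
        return 0.0, 0.0, 0.0, 0.0  # avoids division by zero
    sn = tp / (tp + fn)
    ppv = tp / (tp + fp)
    f = 2 * sn * ppv / (sn + ppv)
    mcc = math.sqrt(sn * ppv)  # geometric-mean approximation of MCC
    return sn, ppv, f, mcc


# Illustrative counts: 17 correct, 4 false and 2 missed base pairs.
sn, ppv, f, mcc = accuracy_measures(tp=17, fp=4, fn=2)
```

With these counts the sensitivity is 17/19, the positive predictive value is 17/21 and the F-measure is exactly 0.85.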

Table 1 shows the results of joint secondary structure prediction of our algorithm, GRNAs, in terms of the Matthews Correlation Coefficient [18]. It is also compared to state-of-the-art methods such as PETcofold [18], the sparsified version of inteRNA [19], Pairfold [3], RactIP [17] and RNAcofold [7]. The MCC evaluates the joint structure, i.e. both the binding sites between the two RNAs and the secondary structure of each single RNA. For the two pairs MicA-ompA and OxyS-fhlA, PETcofold has the best MCC value, and for the other two pairs, RyhB-uof-fur and RyhB-sodB, GRNAs has the highest MCC value.

We also compared GRNAs with four state-of-the-art methods: inRNAs, IntaRNA, RNAup and RactIP. Table 2 shows the sensitivity and positive predictive value of binding-site prediction on the datasets [12,14] for the proposed approach and the mentioned methods. Here, only external base pairs are considered to measure accuracy. According to the second half of Table 2, the average positive predictive value of GRNAs on the datasets [12,14] is 96.03%. This table shows that our method is comparable with the existing methods. Table 3 indicates the accuracy of GRNAs in F-measure when considering binding sites. For GRNAs, the average F-measure is 89%. The results shown in Tables 2 and 3 indicate that GRNAs performs comparably to the other methods in average sensitivity, positive predictive value and F-measure.

Table 2 The results of binding sites prediction of GRNAs in sensitivity and positive predictive value on the datasets [12,14] in comparison to inRNAs, IntaRNA, RNAup and RactIP

               Sensitivity                                  PPV
RNA-RNA pairs  GRNAs    inRNAs   IntaRNA  RNAup    RactIP   GRNAs    inRNAs   IntaRNA  RNAup    RactIP
Tar-Tar*       100      100      100      100      81.5     100      83.3     83.3     83.3     57.9
R1inv-R2inv    100      100      100      100      100      100      77.8     100      77.8     100
DIS-DIS        100      100      100      100      75       100      100      100      100      78.3
CopA-CopT      89.47    88.9     100      55.6     100      80.95    82.8     39.1     65.2     100
IncRNA54-RepZ  71.44    100      73.8     75       100      100      88.9     85       85.7     83.3
DsrA-Rpos      73.08    80.8     80.8     80.8     65.4     100      77.8     77.8     77.8     73.9
GcvB-argT      87.5     95       95       90       95       100      86.4     95       94.7     100
GcvB-dppA      94.12    100      100      100      94.1     100      85       58.6     45.9     59.3
GcvB-gltI      91.67    75       0        0        100      95.65    50       0        0        100
GcvB-livK      70.83    54       54.2     54.2     95.5     89.47    57       56.5     56.5     95.5
GcvB-livJ      95.45    63.4     95.5     95.5     95.8     95.45    82.4     95.5     95.5     95.8
GcvB-oppA      90.91    100      100      100      100      100      73.3     95.7     95.7     100
GcvB-STM4351   72       76       76       88       88       100      100      90.5     95.7     100
IstR-tisAB     69.44    72.2     87.9     66.7     77.8     100      100      96       100      100
MicA-ompA      93.75    100      100      100      87.5     100      100      100      100      87.5
MicA-lamB      82.61    100      100      82.6     56.5     90.48    100      82.1     70.4     86.7
MicC-ompC      90.91    100      100      72.7     72.7     100      100      53.7     41       88.9
MicF-ompF      100      96       96       80       83.3     100      96       96       95.2     76.9
OxyS-fhlA      80       81.3     50       37.5     56.3     86.96    100      100      100      81.8
RyhB-sdhD      58.82    61.8     58.8     79.4     82.4     95.24    95.5     100      79.4     82.4
RyhB-sodB      100      100      100      100      100      100      100      81.8     90       39.1
SgrS-ptsG      60.87    56.6     73.9     73.9     83.9     87.5     76.5     100      100      100
Spot42-galK    61.36    43.2     40.9     52.3     68.2     87.1     76       64.3     52.3     69.8
Average        84.1     84.5     81.9     77.6     84.7     96.03    86.5     80.5     78.4     85.1

Our genetic algorithm randomly selects the sub-diagonals to make individuals. Therefore, different individuals with a variety of sub-diagonals are constructed. Due to the nature of the proposed genetic approach, some RNA-RNA interactions can be predicted more accurately than by the other algorithms. For example, the F-measure obtained for Tar-Tar* is 100%, while the maximum value achieved by the other approaches is 90.9%.

We compare the computational time complexity of GRNAs with that of state-of-the-art methods. The time and space complexity of several algorithms (TIRNA [15], App (approximation algorithm to predict SSIR) [14], ripalign [23], and the other methods in Tables 1, 2 and 3) are given in Table 4. As shown, the time complexity of GRNAs in each iteration is $O(l^2 + |P|)$, where $l$ and $|P|$ indicate the sum of the lengths of the two RNAs and the length of an individual, respectively. The space complexity of the proposed method is also $O(l^2 + |P|)$.

Table 3 The results of binding sites prediction of GRNAs in F-measure on the datasets [12,14] in comparison to inRNAs, IntaRNA, RNAup and RactIP

RNA-RNA pairs  GRNAs    inRNAs   IntaRNA  RNAup    RactIP
Tar-Tar*       100      90.9     90.9     90.9     67.7
R1inv-R2inv    100      87.5     100      87.5     100
DIS-DIS        100      100      100      100      76.6
CopA-CopT      85       85.7     56.2     60       100
IncRNA54-RepZ  82.92    94.1     79       80       90.9
DsrA-Rpos      84.44    79.3     79.3     79.3     69.4
GcvB-argT      93.33    90.5     95       92.3     97.4
GcvB-dppA      96.97    91.9     73.9     62.9     72.2
GcvB-gltI      93.62    60       0        0        100
GcvB-livK      76.07    55.5     55.3     55.3     95.5
GcvB-livJ      95.45    71.7     95.5     95.5     95.8
GcvB-oppA      95.24    84.6     97.8     97.8     100
GcvB-STM4351   83.72    86.4     82.6     91.7     93.6
IstR-tisAB     81.96    83.9     91.8     80       78.5
MicA-ompA      96.77    100      100      100      87.5
MicA-lamB      86.36    100      90.2     76       68.4
MicC-ompC      95.24    100      69.9     52.4     80
MicF-ompF      100      96       96       86.9     80
OxyS-fhlA      83.33    89.7     66.7     54.5     66.7
RyhB-sdhD      72.73    75       74.1     79.4     82.4
RyhB-sodB      100      100      90       94.7     56.3
SgrS-ptsG      71.79    65.1     85       85       85
Spot42-galK    72       55.1     50       52.3     69
Average        89       84.5     79.1     76.3     83.6

Table 4 Comparison of time and space complexity of some algorithms

Algorithm   Time complexity   Space complexity
GRNAs       $O(l^2 + |P|)$    $O(l^2 + |P|)$
TIRNA       Oik" log k^)      Oik")
SPM                           0{nW)
LM                            0(nW)
inRNAs      0{l<^w)           Oik")
RNAup       O(n^m)            0{n^)
EBM                           0(nW)
App                           0{nW)
Pairfold    0{^)              Oik")
IntaRNA     0(nm + nh         O(nm)
ripalign    Oihf")
PETcofold   0{Mlh
RactIP      0(n=)

Here, $l$ and $|P|$ indicate the sum of the lengths of the two RNAs and the length of an individual, respectively.

Conclusion

In this paper, a new genetic algorithm was introduced for solving the RNA-RNA interaction prediction problem. In this algorithm, all possible stems in each RNA and hybrid regions between the two RNAs are extracted from a dot matrix showing all possible base pairs. The initial population is formed from some stems and hybrid regions of the dot matrix. Minimum free energy is used as the fitness function. The crossover operation is performed between consecutive individuals in the population. Mutation is applied to a few randomly selected individuals. The generation of populations continues until the free energy of the best individual becomes low enough. Finally, one of the best individuals is selected to form the RNA-RNA interaction structure. The proposed algorithm was tested on several RNA-RNA interaction datasets. The experimental results indicate a high accuracy of GRNAs. Furthermore, the time and space complexity of GRNAs is as efficient as that of the other related studies.

Availability

The program of GRNAs is available at http://mostafa.ut.ac.ir/grnas.